Abstract

This paper considers fountain communication over discrete-time memoryless channels. We extend concatenated coding schemes to fountain systems and derive the achievable fountain error exponents for one-level and multi-level concatenated fountain codes. Encoding and decoding complexities of the concatenated fountain codes are linear in the number of transmitted symbols and the number of received symbols, respectively. Performances of concatenated fountain codes in rate compatible fountain communication and fountain communication over an unknown channel are discussed.

Index Terms

coding complexity, concatenated codes, error exponent, fountain communication

I. Introduction

Fountain communication [1] is a new communication model originally proposed for reliable data transmission over erasure channels. In a point-to-point fountain communication system, the transmitter maps a message into an infinite sequence of channel symbols and sends them to the receiver. The receiver decodes the message after the number of received symbols exceeds a certain threshold. Due to random symbol erasures, communication duration in a fountain system is determined by the receiver, rather than by the transmitter. The first realization of fountain codes was LT codes, introduced by Luby [2] for erasure channels. LT codes can recover $k$ information symbols from $k + O(\sqrt{k}\ln^2(k/\delta))$ encoded symbols with probability $1-\delta$ and a complexity of $O(k\ln(k/\delta))$, for any $\delta > 0$ [2]. Shokrollahi proposed Raptor codes [3] by combining appropriate LT codes with a pre-code. Raptor codes can recover $k$ information symbols from $k(1+\epsilon)$ encoded symbols with high probability and complexity $O(k\log(1/\epsilon))$. For erasure channels, both LT codes and Raptor codes can achieve the optimum rate irrespective of the erasure statistics. Generalization of Raptor codes from erasure channels to binary symmetric channels (BSCs) was studied by Etesami and Shokrollahi in [4]. In [5], Shamai, Telatar and Verdú systematically extended fountain communication to arbitrary channels and showed that fountain capacity [5] and Shannon capacity take the same value for stationary memoryless channels. Achievability of fountain capacity was demonstrated in [5] using a random coding scheme whose coding complexity is exponential in the number of received symbols. This consequently motivated the question of whether the fountain capacity of a stationary memoryless channel is achievable with a linear coding complexity.

In classical point-to-point communication over discrete-time memoryless channels, Feinstein [6] demon- 
strated that communication error probability can be made to decrease exponentially in the codeword length. 
The corresponding exponent is known as the error exponent. Tight lower and upper bounds on error 
exponent were obtained by Gallager [7], and by Shannon, Gallager, Berlekamp [8], respectively. In [9], 
Forney proposed a one-level concatenated coding scheme that combines a Hamming-sense error correction 
outer code with Shannon-sense random inner channel codes. One-level concatenated codes can achieve 
a positive error exponent, known as Forney's exponent, for any rate less than Shannon capacity with a 
polynomial complexity [9]. Forney's concatenated codes were generalized by Blokh and Zyablov [10] to 
multi-level concatenated codes, whose maximum achievable error exponent is known as the Blokh-Zyablov 
error exponent. In [11], Guruswami and Indyk introduced a class of linear complexity near maximum
distance separable (MDS) error-correction codes. By using Guruswami-Indyk's codes as outer codes in 
concatenated coding schemes, achievability of Forney's and Blokh-Zyablov exponents with linear coding 
complexity was proved in [12]. 

In this paper, we extend concatenated coding schemes to fountain communication over discrete-time memoryless channels, as modeled in Section II. Random fountain codes are briefly introduced in Section III. By defining the error probability scaling law with respect to the number of received symbols, we derive in Section IV the error exponents achievable by one-level and multi-level concatenated fountain codes, and show that their encoding and decoding complexities are linear in the number of transmitted symbols and the number of received symbols, respectively. In Section V, we consider rate compatible fountain communication where part of the source message is known at the receiver. With the transmitter still encoding the complete message, we show that concatenated fountain codes can achieve the same rate and error performance as if only the unknown part of the message is encoded. We briefly discuss fountain communication over an unknown channel in Section VI.

All logarithms in this paper are natural logarithms.

II. The Fountain Communication Model 

Consider the fountain communication system illustrated in Figure 1. Assume the encoder uses a fountain coding scheme [5] with $W$ codewords to map the source message $w \in \{1, 2, \dots, W\}$ to an infinite channel input symbol sequence $\{x_{w1}, x_{w2}, \dots\}$. Assume the channel is discrete-time memoryless, characterized by the conditional probability mass function (PMF) or probability density function (PDF) $p_{Y|X}(y|x)$, where $x \in \mathcal{X}$ and $y \in \mathcal{Y}$ are the input and output symbols, and $\mathcal{X}$ and $\mathcal{Y}$ are the channel input and output alphabets, respectively. Define a schedule $\mathcal{N} = \{i_1, i_2, \dots, i_{|\mathcal{N}|}\}$ as a subset of positive integers, where $|\mathcal{N}|$ is the cardinality of $\mathcal{N}$ [5]. Assume the erasure device generates an arbitrary schedule $\mathcal{N}$, whose elements are indices of the received symbols $\{y_{wi_1}, y_{wi_2}, \dots, y_{wi_N}\}$, where $N = |\mathcal{N}|$. We say the fountain rate of the system is $R = (\log W)/N$ if the decoder outputs an estimate $\hat{w}$ of the source message after observing $N$ channel symbols, based on $\{y_{wi_1}, y_{wi_2}, \dots, y_{wi_N}\}$ and $\mathcal{N}$. A decoding error happens when $\hat{w} \neq w$. Define the error probability $P_e(N)$ as in [5],

$$P_e(N) = \sup_{\mathcal{N},\, |\mathcal{N}| \ge N} \Pr\{\hat{w} \neq w \mid \mathcal{N}\}. \tag{1}$$

Fig. 1. Fountain communication over a memoryless channel.

We say a fountain rate $R$ is achievable if there exists a fountain coding scheme with $\lim_{N\to\infty} P_e(N) = 0$ at rate $R$ [5]. The exponential rate at which the error probability vanishes is defined as the fountain error exponent, $E_F(R)$,

$$E_F(R) = \lim_{N\to\infty} -\frac{1}{N}\log P_e(N). \tag{2}$$

Define the fountain capacity $C_F$ as the supremum of all achievable fountain rates. It was shown in [5] that $C_F$ equals the Shannon capacity of a stationary memoryless channel.

III. Random Fountain Codes 

In a random fountain coding scheme [5], the encoder and decoder share a fountain code library $\mathcal{C} = \{C_\theta : \theta \in \Theta\}$, which is a collection of fountain codebooks $C_\theta$ with $\theta$ being the index. All codebooks in the library have the same number of codewords and each codeword has an infinite number of channel input symbols. Let $C_\theta(m)_j$ be the $j$th codeword symbol of message $m$ in $C_\theta$. To encode the message, the encoder first generates $\theta$ according to a distribution $\vartheta$, such that the random variables $x_{mj} : \theta \to C_\theta(m)_j$ are i.i.d. with a pre-determined input distribution $p_X$ [5]. Then the encoder uses codebook $C_\theta$ to map the message into a codeword. We assume the actual realization of $\theta$ is known to the decoder but is unknown to the erasure device. Maximum likelihood decoding is assumed.

Theorem 1: Consider fountain communication over a discrete-time memoryless channel $p_{Y|X}$. Let $C_F$ be the fountain capacity. For any fountain rate $R < C_F$, random fountain codes achieve the following random-coding fountain error exponent, $E_{Fr}(R)$.

$$E_{Fr}(R) = \max_{p_X} E_{FL}(R, p_X), \tag{3}$$

where $E_{FL}(R, p_X)$ is defined as follows

$$\begin{aligned} E_{FL}(R, p_X) &= \max_{0 \le \rho \le 1}\{-\rho R + E_0(\rho, p_X)\}, \\ E_0(\rho, p_X) &= -\log \sum_y \left( \sum_x p_X(x)\, p_{Y|X}(y|x)^{\frac{1}{1+\rho}} \right)^{1+\rho}. \end{aligned} \tag{4}$$

If the channel is continuous, then summations in (4) should be replaced by integrals. ■

Theorem 1 was claimed implicitly in, and can be shown by, the proof of [5, Theorem 2].
$E_{Fr}(R)$ given in (3) equals the random-coding exponent of a classical communication system over the same channel [7]. For binary symmetric channels (BSCs), since random linear codes simultaneously achieve the random-coding exponent at high rates and the expurgated exponent at low rates [13], it can be easily shown that the same fountain error exponent is achievable by random linear fountain codes. However, it is not clear whether there exists an expurgation operation, such as the one proposed in [7], that is robust to the observation of any subset of channel outputs. Whether the expurgated exponent is achievable for fountain communication over a general discrete-time memoryless channel is therefore unknown.
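As a numerical illustration of Theorem 1, the following sketch evaluates $E_0(\rho, p_X)$ of (4) and the maximization over $\rho$ for a BSC. It is only a sketch under stated assumptions: the input distribution is fixed (uniform by default, which is optimal for a BSC by symmetry), the maximization is replaced by a grid search, and the function names are hypothetical rather than taken from the paper.

```python
import math

def e0_bsc(rho, eps, q=0.5):
    """E_0(rho, p_X) of (4) for a BSC with crossover eps and P(X=1) = q."""
    p_x = (1.0 - q, q)
    total = 0.0
    for y in (0, 1):
        inner = sum(
            p_x[x] * ((1.0 - eps) if x == y else eps) ** (1.0 / (1.0 + rho))
            for x in (0, 1)
        )
        total += inner ** (1.0 + rho)
    return -math.log(total)

def e_fl_bsc(rate, eps, q=0.5, steps=1000):
    """Grid-search approximation of max over 0 <= rho <= 1 of -rho*R + E_0."""
    return max(
        -(i / steps) * rate + e0_bsc(i / steps, eps, q) for i in range(steps + 1)
    )

def capacity_bsc(eps):
    """Shannon capacity of a BSC in nats (equal to the fountain capacity C_F)."""
    return math.log(2.0) + eps * math.log(eps) + (1.0 - eps) * math.log(1.0 - eps)

if __name__ == "__main__":
    eps = 0.1
    cap = capacity_bsc(eps)
    for frac in (0.25, 0.5, 0.75):
        print(f"R = {frac:.2f} C_F: exponent ~ {e_fl_bsc(frac * cap, eps):.4f}")
```

With the uniform input, the value returned for $R = C_F$ is zero up to rounding, and the value at $R = 0$ equals $E_0(1, p_X)$, as expected from (4).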



IV. Concatenated Fountain Codes 
Consider the one-level concatenated fountain coding scheme illustrated in Figure 2. Assume the source message $w$ can take $\exp(NR)$ possible values with equal probability, where $R$ is the targeted fountain information rate, and the decoder decodes the source message after receiving $N$ channel symbols. The encoder first encodes the message using an outer code into an outer codeword, $\{\xi_1, \xi_2, \dots, \xi_{N_o}\}$, with $N_o$ outer symbols. We assume the outer code is a linear-time encodable/decodable near MDS error-correction code of rate $r_o \in [0, 1]$. That is, the outer code can recover the source message from a codeword with $d$ symbol erasures and $t$ symbol errors, so long as $2t + d \le (1 - r_o - \zeta_0)N_o$, where $\zeta_0 > 0$ is a positive constant that can be made arbitrarily small. An example of such a linear complexity error-correction code was presented by Guruswami and Indyk in [11]. Each outer symbol can take $\exp\left(\frac{N}{N_o}\frac{R}{r_o}\right)$ possible values. Define $N_i = \frac{N}{N_o}$, $R_i = \frac{R}{r_o}$. The encoder then uses a set of random fountain codes, each with $\exp(N_i R_i)$ codewords, to map each outer symbol into an inner codeword, which is an infinite sequence of channel input symbols $\{x_{k1}, x_{k2}, \dots\}$. Let $C_\theta^{(k)}(\xi_k)_j$ be the $j$th codeword symbol of the $k$th inner code in codebook $C_\theta^{(k)}$, where $\theta$ is the codebook index as introduced in Section III. We assume $\theta$ is generated according to a distribution $\vartheta$ such that the random variables $x_{kj} : \theta \to C_\theta^{(k)}(\xi_k)_j$ are i.i.d. with a pre-determined input distribution $p_X$. To simplify the notation, we have assumed that $N_i$, $N_o$, $NR$, and $N_i R_i$ are all integers. We also assume $N_o \gg N_i \gg 1$.

Fig. 2. One-level concatenated fountain codes.

After encoding, the inner codewords are regarded as $N_o$ channel symbol queues, as illustrated in Figure 2. In the $l$th time unit, the encoder uses a random switch to pick one inner code with index $k_l(\theta)$ uniformly, and sends the first channel input symbol in the corresponding queue through the channel. The transmitted symbol is then removed from the queue. We assume the random variables $k_l : \theta \to \{1, 2, \dots, N_o\}$ are i.i.d. uniform. We assume the decoder knows the outer codebook and the code libraries of the inner codes. We also assume the encoder and the decoder share the realization of $\theta$ such that the decoder knows the exact codebook used in each inner code and the exact order in which channel input symbols are transmitted.

Decoding starts after $N = N_o N_i$ channel output symbols are received. The decoder first distributes the received symbols to the corresponding inner codewords. Assume $z_k N_i$ channel output symbols are received from the $k$th inner codeword, where $z_k \ge 0$ and $z_k N_i$ is an integer. We term $z_k$ the normalized effective codeword length of the $k$th inner code. Based on $z_k$ and the received channel output symbols, $\{y_{ki_1}, y_{ki_2}, \dots, y_{ki_{z_k N_i}}\}$, the decoder computes the maximum likelihood estimate $\hat{\xi}_k$ of the outer symbol together with an optimized reliability weight $\alpha_k \in [0, 1]$. We assume, given $z_k$ and $\{y_{ki_1}, y_{ki_2}, \dots, y_{ki_{z_k N_i}}\}$, the reliability weight $\alpha_k$ is computed using Forney's algorithm presented in [9, Section 4.2]. After that, the decoder carries out generalized minimum distance (GMD) decoding of the outer code and outputs an estimate $\hat{w}$ of the source message. GMD decoding of the outer code here is the same as that in a classical communication system, the detail of which can be found in [12].

Compared to a classical communication system where all inner codes have the same length, in a 
concatenated fountain coding scheme, numbers of received symbols from different inner codes may be 
different. Consequently, error exponent achievable by one-level concatenated fountain codes is less than 
Forney's exponent. 
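The spread of the normalized effective codeword lengths can be illustrated with a small simulation. The sketch below assumes that each received symbol belongs to an inner code chosen i.i.d. uniformly, mirroring the random switch; the function names and parameter values are hypothetical choices for illustration.

```python
import random

def normalized_lengths(n_outer, n_inner, seed=0):
    """Return z_k for k = 1..N_o after N = N_o * N_i symbols are received.

    Assumption: each received symbol comes from an inner code picked
    i.i.d. uniformly, as done by the random switch at the encoder.
    """
    rng = random.Random(seed)
    counts = [0] * n_outer
    for _ in range(n_outer * n_inner):
        counts[rng.randrange(n_outer)] += 1
    return [c / n_inner for c in counts]

def mean_and_variance(values):
    """Sample mean and (biased) sample variance of a list of numbers."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var

if __name__ == "__main__":
    z = normalized_lengths(n_outer=2000, n_inner=20)
    print(mean_and_variance(z))
```

The lengths always average to one, since the received symbols are split among the $N_o$ inner codes, while their sample variance is close to $1/N_i$. This variation across inner codes is what separates the fountain setting from a classical concatenated code with equal inner code lengths.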

Theorem 2: Consider fountain communication over a discrete-time memoryless channel $p_{Y|X}$ with fountain capacity $C_F$. For any fountain rate $R < C_F$, the following fountain error exponent can be arbitrarily approached by one-level concatenated fountain codes.

$$E_{Fc}(R) = \max_{p_X,\, \frac{R}{C_F} \le r_o \le 1,\, 0 \le \rho \le 1} (1 - r_o)\left( -\rho\frac{R}{r_o} + E_0(\rho, p_X)\left[ 1 - \frac{1 + r_o}{2} E_0(\rho, p_X) \right] \right), \tag{5}$$

where $E_0(\rho, p_X)$ is defined in (4).

Encoding and decoding complexities of the one-level concatenated codes are linear in the number of transmitted symbols and the number of received symbols, respectively. ■

The proof of Theorem 2 is given in Appendix A.

Corollary 1: $E_{Fc}(R)$ is upper-bounded by Forney's error exponent $E_c(R)$ given in [9]. $E_{Fc}(R)$ is lower-bounded by $\tilde{E}_{Fc}(R)$, defined by

$$\tilde{E}_{Fc}(R) = \max_{p_X,\, \frac{R}{C_F} \le r_o \le 1,\, 0 \le \rho \le 1} (1 - r_o)\left( -\rho\frac{R}{r_o} + E_0(\rho, p_X)\left[ 1 - E_0(\rho, p_X) \right] \right). \tag{6}$$

The lower bound is asymptotically tight in the sense that

$$\lim_{R \to C_F} \frac{\tilde{E}_{Fc}(R)}{E_{Fc}(R)} = 1. \tag{7}$$

The proof of Corollary 1 is given in Appendix B.

In Figure 3, we illustrate $E_{Fc}(R)$, $E_c(R)$, and $\tilde{E}_{Fc}(R)$ for a BSC with crossover probability 0.1. We can see that $E_{Fc}(R)$ is closely approximated by $\tilde{E}_{Fc}(R)$.

Fig. 3. Comparison of the fountain error exponent $E_{Fc}(R)$, its upper bound $E_c(R)$, and its lower bound $\tilde{E}_{Fc}(R)$.
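The expressions in (5) and (6) can be evaluated numerically. The sketch below is a grid-search approximation for a BSC, assuming a uniform input distribution (optimal for a BSC by symmetry) and a rate strictly between 0 and $C_F$; the function names and grid sizes are hypothetical and are not the procedure used to produce Figure 3.

```python
import math

def e0_bsc(rho, eps):
    """E_0(rho, p_X) of (4) for a BSC with uniform inputs (natural logarithm)."""
    s = 1.0 / (1.0 + rho)
    return rho * math.log(2.0) - (1.0 + rho) * math.log(eps ** s + (1.0 - eps) ** s)

def concatenated_exponent(rate, eps, weight, grid=200):
    """Grid search of (1 - r_o)(-rho R / r_o + E_0 [1 - weight(r_o) E_0]).

    Requires 0 < rate < C_F so that the search interval for r_o is valid.
    """
    cap = math.log(2.0) + eps * math.log(eps) + (1.0 - eps) * math.log(1.0 - eps)
    lo = rate / cap
    best = 0.0
    for i in range(grid + 1):
        r_o = lo + (1.0 - lo) * i / grid
        for j in range(grid + 1):
            rho = j / grid
            e0 = e0_bsc(rho, eps)
            value = (1.0 - r_o) * (
                -rho * rate / r_o + e0 * (1.0 - weight(r_o) * e0)
            )
            best = max(best, value)
    return best

def e_fc(rate, eps):
    """Approximation of (5), with bracket weight (1 + r_o) / 2."""
    return concatenated_exponent(rate, eps, lambda r_o: (1.0 + r_o) / 2.0)

def e_fc_lower(rate, eps):
    """Approximation of the lower bound (6), with bracket weight 1."""
    return concatenated_exponent(rate, eps, lambda r_o: 1.0)

if __name__ == "__main__":
    for rate in (0.1, 0.2, 0.3):
        print(rate, e_fc(rate, 0.1), e_fc_lower(rate, 0.1))
```

Because the two functions differ only in the weight applied to $E_0^2$, the lower bound never exceeds the value of (5) on the same grid.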

Extending the one-level concatenated fountain codes to multi-level concatenated fountain codes is essentially the same as in classical communication systems [10] [12], except that random fountain codes are used as inner codes in a fountain system. The achievable error exponent of an $m$-level concatenated fountain code is given in the following theorem.

Theorem 3: Consider fountain communication over a discrete-time memoryless channel $p_{Y|X}$ with fountain capacity $C_F$. For any fountain rate $R < C_F$, the following fountain error exponent can be arbitrarily approached by an $m$-level concatenated fountain code,

$$\begin{aligned} E_{Fc}^{(m)}(R) &= \max_{p_X,\, \frac{R}{C_F} \le r_o \le 1} \frac{\frac{R}{r_o} - R}{\frac{R}{r_o m}\sum_{i=1}^{m}\left[ \tilde{E}_{FL}\left( \frac{i}{m}\frac{R}{r_o}, p_X \right) \right]^{-1}}, \\ \tilde{E}_{FL}(x, p_X) &= \max_{0 \le \rho \le 1}\left( -\rho x + E_0(\rho, p_X)\left[ 1 - E_0(\rho, p_X) \right] \right), \end{aligned} \tag{8}$$

where $E_0(\rho, p_X)$ is defined in (4).

For a given $m$, encoding and decoding complexities of the $m$-level concatenated codes are linear in the number of transmitted symbols and the number of received symbols. ■

Theorem 3 can be proved by following the analysis of $m$-level concatenated codes presented in [10] [14] and replacing the inner code error exponent in the analysis with the error exponent lower bound given in Corollary 1.

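Corollary 2: The following fountain error exponent can be arbitrarily approached by multi-level concatenated fountain codes with linear encoding/decoding complexity.

$$E_{Fc}^{(\infty)}(R) = \max_{p_X,\, \frac{R}{C_F} \le r_o \le 1} \left( \frac{R}{r_o} - R \right)\left[ \int_0^{\frac{R}{r_o}} \frac{dx}{\tilde{E}_{FL}(x, p_X)} \right]^{-1}, \tag{9}$$

where $\tilde{E}_{FL}(x, p_X)$ is defined in (8). ■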
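A rough numerical evaluation of the limiting expression in (9) is sketched below for a BSC. The assumptions are ours: a uniform input distribution, grid searches over $r_o$ and $\rho$, and a midpoint rule for the integral. All names are hypothetical, and the output is only an approximation of the curve discussed in the text.

```python
import math

def e0_bsc(rho, eps):
    """E_0(rho, p_X) of (4) for a BSC with uniform inputs (natural logarithm)."""
    s = 1.0 / (1.0 + rho)
    return rho * math.log(2.0) - (1.0 + rho) * math.log(eps ** s + (1.0 - eps) ** s)

def multilevel_exponents(rate, eps, rho_grid=200, ro_grid=40, steps=100):
    """Return (approximation of (9), one-level lower bound (6)) on the same grids."""
    cap = math.log(2.0) + eps * math.log(eps) + (1.0 - eps) * math.log(1.0 - eps)
    # Tabulate g(rho) = E_0 (1 - E_0) once; the modified exponent of (8)
    # is then the maximum over the grid of -rho * x + g(rho).
    table = []
    for j in range(rho_grid + 1):
        rho = j / rho_grid
        e0 = e0_bsc(rho, eps)
        table.append((rho, e0 * (1.0 - e0)))

    def e_fl_tilde(x):
        return max(-rho * x + g for rho, g in table)

    lo = rate / cap
    best_multi, best_single = 0.0, 0.0
    # r_o = R / C_F is skipped: the integrand is unbounded at x = C_F.
    for i in range(1, ro_grid + 1):
        r_o = lo + (1.0 - lo) * i / ro_grid
        upper = rate / r_o
        best_single = max(best_single, (1.0 - r_o) * e_fl_tilde(upper))
        h = upper / steps
        integral = 0.0
        for n in range(steps):  # midpoint rule for the integral of 1 / E~_FL
            value = e_fl_tilde((n + 0.5) * h)
            if value <= 0.0:
                integral = math.inf
                break
            integral += h / value
        best_multi = max(best_multi, (upper - rate) / integral)
    return best_multi, best_single

if __name__ == "__main__":
    print(multilevel_exponents(0.2, 0.1))
```

Since $\tilde{E}_{FL}(x, p_X)$ is non-increasing in $x$, the value obtained from (9) is never smaller than the one-level lower bound computed on the same grid, which gives a simple sanity check on the numerical output.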

In Figure 4, we illustrate $E_{Fc}^{(\infty)}(R)$ and the Blokh-Zyablov exponent $E_c^{(\infty)}(R)$ for a BSC with crossover probability 0.1. It can be seen that $E_{Fc}^{(\infty)}(R)$ is not far from the Blokh-Zyablov exponent.

Fig. 4. Comparison of the multi-level fountain error exponent $E_{Fc}^{(\infty)}(R)$ and the Blokh-Zyablov exponent $E_c^{(\infty)}(R)$.

V. Rate Compatible Fountain Communication

Consider the application of software patch distribution. When a significant number of patches are released, the software company may want to combine the patches into a service pack. However, if a user already has some of the patches, the user may only want to download the new patches, rather than the whole service pack. For the convenience of the patch server, all patches of the service pack should be encoded jointly. But for the communication efficiency of each particular user, we also want the fountain system to achieve the same rate and error performance as if only the novel part of the service pack is transmitted. We require such optimality to be achieved simultaneously for all users, and term such a fountain communication model rate compatible fountain communication.

Assume a source message $w$, which takes $\exp(NR)$ possible values, can be partitioned into $L$ sub-messages $w = [w_1, w_2, \dots, w_L]$, where $w_i$, $\forall i$, can take $\exp(N r_i)$ possible values, with $\sum_i r_i = R$. Consider the following extended one-level concatenated fountain coding scheme. For all $i \in \{1, \dots, L\}$, the encoder first uses a near MDS outer code with length $N_o$ and rate $r_o$ to encode sub-message $w_i$ into an outer codeword $\{\xi_{i1}, \dots, \xi_{iN_o}\}$, as illustrated in Figure 5. Next, for all $k \in \{1, \dots, N_o\}$, the encoder combines outer codeword symbols $\{\xi_{1k}, \dots, \xi_{Lk}\}$ into a macro symbol $\xi_k = [\xi_{1k}, \dots, \xi_{Lk}]$. A random fountain code is then used to map $\xi_k$ into an infinite channel input sequence $\{x_{k1}, x_{k2}, \dots\}$.

Fig. 5. Concatenated fountain codes for rate compatible communication (outer codewords, inner codewords, encoded symbols).

Without loss of generality, we assume the decoder already has sub-messages $\{w_{l+1}, \dots, w_L\}$, where $l \in [1, L-1]$ is an integer. The decoder estimates the source message after $N_l = \frac{\sum_{i=1}^{l} r_i}{R} N$ channel output symbols are received. From the decoder's point of view, since the unknown messages $[w_1, \dots, w_l]$ can only take $\exp\left(N\sum_{i=1}^{l} r_i\right)$ possible values, the effective fountain information rate of the system is $\frac{N\sum_{i=1}^{l} r_i}{N_l} = R$. According to the known messages $[w_{l+1}, \dots, w_L]$, the decoder first strikes out from the fountain codebooks all codewords corresponding to the wrong messages. The one-level concatenated fountain code is then decoded using the same procedure as described in Section IV. Assume the average number of symbols received by each inner codeword, $\frac{N_l}{N_o} = \frac{\sum_{i=1}^{l} r_i}{R} N_i$, is large enough to enable asymptotic error performance analysis. By following an analysis similar to the proof of Theorem 2, it can be seen that the error exponent $E_{Fc}(R)$ given in (5) can still be arbitrarily approached, irrespective of the value of $l$.

Therefore, given a rate partitioning $[r_1, \dots, r_L]$ of $R$, the encoder can encode the complete message irrespective of the sub-messages known at the decoder. The fountain system can achieve the same rate and error performance as if only the unknown sub-messages are encoded and transmitted. Extending the scheme to multi-level concatenated codes is straightforward.
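The bookkeeping behind the rate compatible scheme is simple arithmetic, sketched below. The function name and the example numbers are hypothetical; the sketch only assumes that the unknown sub-messages are the first ones in the list.

```python
import math

def rate_compatible_threshold(n_total, sub_rates, num_unknown):
    """Return (N_l, effective rate) when the first num_unknown sub-messages
    are unknown at the decoder and the remaining ones are already known.

    n_total is N, the number of symbols needed when nothing is known.
    """
    total_rate = sum(sub_rates)                    # R = sum of all r_i
    unknown_rate = sum(sub_rates[:num_unknown])    # sum of r_i over unknown parts
    n_l = n_total * unknown_rate / total_rate      # symbols needed by this user
    effective_rate = n_total * unknown_rate / n_l  # nats per received symbol
    return n_l, effective_rate

if __name__ == "__main__":
    # A user that already holds the last two of three patches.
    print(rate_compatible_threshold(1000, [0.1, 0.05, 0.05], num_unknown=1))
```

A user who already holds sub-messages carrying half of the total rate needs half as many channel symbols, while the effective rate seen by the decoder stays at $R$.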



VI. Fountain Communication over An Unknown Channel

In previous sections, we have assumed that concatenated fountain codes should be optimized based on a known memoryless channel model $p_{Y|X}$. However, such an optimization may face various challenges in practical applications. For example, suppose a transmitter broadcasts encoded symbols to multiple receivers simultaneously. Channels experienced by different receivers may be different. Even if the channels are known, the transmitter still faces the problem of optimizing fountain codes simultaneously for multiple channels. For another example, suppose the source message (e.g., a software patch) is available at multiple servers. A user may collect encoded symbols from multiple servers separately over different channels and use these symbols to jointly decode the message. By regarding the symbols as received over a virtual channel, we want the fountain system to achieve good rate and error performance without requiring the virtual channel model at the transmitter. We term the communication model in the latter example rate combining fountain communication. In both examples, the research question is whether key coding parameters can be determined without full channel knowledge at the transmitter.

Consider fountain communication over a memoryless channel $p_{Y|X}$ using one-level concatenated fountain codes. We assume the channel is symmetric, and hence the optimal input distribution $p_X$ is known at the transmitter. Other than its symmetry, we assume channel information $p_{Y|X}$ is unknown at the transmitter, but known at the receiver. Given $p_X$, define $I(p_X) = I(X; Y)$ as the mutual information between the input and output of the memoryless channel. We assume the transmitter and the receiver agree on achieving a fountain information rate of $\gamma I(p_X)$, where $\gamma$ is termed the normalized fountain rate, known at the transmitter.

Recall from the proof of Theorem 2 that, if $p_{Y|X}$ is known at the transmitter, the following error exponent can be arbitrarily approached.

$$\begin{aligned} E_{Fc}(\gamma, p_X) &= \max_{\gamma \le r_o \le 1} E_{Fc}(\gamma, p_X, r_o), \\ E_{Fc}(\gamma, p_X, r_o) &= \max_{0 \le \rho \le 1} (1 - r_o)\left( -\rho\frac{\gamma}{r_o} I(p_X) + E_0(\rho, p_X)\left[ 1 - \frac{1 + r_o}{2} E_0(\rho, p_X) \right] \right). \end{aligned} \tag{10}$$

Without $p_{Y|X}$ at the transmitter, we set the outer code rate $r_o$ at $r_o = \frac{\sqrt{\gamma^2 + 8\gamma} - \gamma}{2}$ and define the corresponding error exponent by

$$E_{Fcs}(\gamma, p_X) = E_{Fc}\left( \gamma, p_X, \frac{\sqrt{\gamma^2 + 8\gamma} - \gamma}{2} \right). \tag{11}$$

The following theorem indicates that $E_{Fcs}(\gamma, p_X)$ approaches $E_{Fc}(\gamma, p_X)$ asymptotically.

Theorem 4: Given the memoryless channel $p_{Y|X}$ and a source distribution $p_X$, the following limit holds,

$$\lim_{\gamma \to 1} \frac{E_{Fcs}(\gamma, p_X)}{E_{Fc}(\gamma, p_X)} = 1. \tag{12}$$

The proof of Theorem 4 is given in Appendix C.

In Figure 6, we plot $E_{Fcs}(\gamma, p_X)$ and $E_{Fc}(\gamma, p_X)$ for a BSC with crossover probability 0.1. It can be seen that setting $r_o$ at $r_o = \frac{\sqrt{\gamma^2 + 8\gamma} - \gamma}{2}$ is near optimal for all normalized fountain rate values. Further discussion of fountain communication over unknown channels is outside the scope of this paper.

Fig. 6. Error exponents achieved by the optimal $r_o$ and the suboptimal $r_o = \frac{\sqrt{\gamma^2 + 8\gamma} - \gamma}{2}$ versus the normalized fountain rate $\gamma$.
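The comparison between the channel-independent outer code rate and the optimized one can be reproduced approximately with the sketch below. It assumes a BSC with uniform inputs, so that $I(p_X)$ equals the capacity, and it uses grid searches in place of the exact maximizations in (10); the function names are hypothetical.

```python
import math

def e0_bsc(rho, eps):
    """E_0(rho, p_X) of (4) for a BSC with uniform inputs (natural logarithm)."""
    s = 1.0 / (1.0 + rho)
    return rho * math.log(2.0) - (1.0 + rho) * math.log(eps ** s + (1.0 - eps) ** s)

def suboptimal_ro(gamma):
    """Channel-independent outer code rate used in (11)."""
    return (math.sqrt(gamma * gamma + 8.0 * gamma) - gamma) / 2.0

def e_fc_given_ro(gamma, r_o, eps, grid=400):
    """Grid-search approximation of E_Fc(gamma, p_X, r_o) in (10)."""
    info = math.log(2.0) + eps * math.log(eps) + (1.0 - eps) * math.log(1.0 - eps)
    best = 0.0
    for j in range(grid + 1):
        rho = j / grid
        e0 = e0_bsc(rho, eps)
        value = (1.0 - r_o) * (
            -rho * gamma * info / r_o + e0 * (1.0 - 0.5 * (1.0 + r_o) * e0)
        )
        best = max(best, value)
    return best

def compare(gamma, eps, grid=200):
    """Return (E_Fcs, E_Fc) approximations for 0 < gamma < 1."""
    r_sub = suboptimal_ro(gamma)
    e_sub = e_fc_given_ro(gamma, r_sub, eps)
    # Search r_o over [gamma, 1]; r_sub is added so the optimum is never below e_sub.
    candidates = [gamma + (1.0 - gamma) * i / grid for i in range(grid + 1)]
    candidates.append(r_sub)
    e_opt = max(e_fc_given_ro(gamma, r, eps) for r in candidates)
    return e_sub, e_opt

if __name__ == "__main__":
    for gamma in (0.3, 0.5, 0.7, 0.9):
        e_sub, e_opt = compare(gamma, 0.1)
        print(gamma, e_sub, e_opt)
```

The rate $\frac{\sqrt{\gamma^2 + 8\gamma} - \gamma}{2}$ always lies in $[\gamma, 1]$ for $0 < \gamma \le 1$ and equals 1 at $\gamma = 1$, so it is a valid outer code rate for every normalized fountain rate.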



VII. Conclusions 

We proposed concatenated fountain codes with linear coding complexity for fountain communication 
over a discrete-time memoryless channel. Fountain error exponents achievable by one-level and multi-level 
concatenated codes were derived. It was shown that the fountain error exponents are less than but close 
to Forney's and Blokh-Zyablov exponents. In rate compatible communication where decoder knows part 
of the message, with the encoder still encoding the complete message, concatenated fountain codes can 
achieve the same rate and error performance as if only the unknown part of the message is encoded. For one-level concatenated codes and for some channels, it was also shown that a near optimal error exponent can be achieved with an outer code rate independent of the channel statistics.

Appendix 

A. Proof of Theorem 2

Proof: We first introduce the basic idea of the proof.

Assume the decoder starts decoding after receiving $N = N_o N_i$ symbols, where $N_o$ is the length of the outer codeword and $N_i$ is the expected number of received symbols from each inner code. In the following error exponent analysis, we will obtain asymptotic results by first taking $N_o$ to infinity and then taking $N_i$ to infinity.

Let $z$ be an $N_o$-dimensional vector whose $k$th element $z_k$ is the normalized effective codeword length of the $k$th inner code, from which the conditional empirical distribution function $F_{Z|\theta}$ can be induced, as a function of variable $z \ge 0$, given the random variable $\theta$ specified in Section III. Let the conditional density function of $F_{Z|\theta}$ be $f_{Z|\theta}$. Note that the empirical density function $f_{Z|\theta}$ itself is a random variable, whose distribution is denoted by $G_F$, as a function of $f_{Z|\theta}$. Assume, given $\theta$, the conditional error probability of the concatenated code can be written as $P_{e|\theta}(f_{Z|\theta}) = \exp(-N_i N_o E_f(f_{Z|\theta}, R))$, where the conditional error exponent $E_f(f_{Z|\theta}, R)$ is a function of $f_{Z|\theta}$. The overall error probability can therefore be written as

$$P_e = \int \exp(-N_i N_o E_f(f_{Z|\theta}, R))\, dG_F(f_{Z|\theta}). \tag{13}$$

Consequently, the error exponent of the concatenated code is given by

$$\begin{aligned} E_{Fc}(R) &= \lim_{N_i \to \infty}\lim_{N_o \to \infty} -\frac{1}{N_i N_o}\log \int \exp(-N_i N_o E_f(f_{Z|\theta}, R))\, dG_F(f_{Z|\theta}) \\ &= \min_{f_Z}\left[ E_f(f_Z, R) - \lim_{N_i, N_o \to \infty}\frac{1}{N_i N_o}\log dG_F(f_Z) \right], \end{aligned} \tag{14}$$

where in the second equality we wrote $f_{Z|\theta}$ as $f_Z$ to simplify the notation.

The rest of the proof contains three parts. In Part I, we derive the expression of $\lim_{N_i, N_o \to \infty}\frac{1}{N_i N_o}\log dG_F(f_Z)$. In Part II, we derive the expression of $E_f(f_Z, R)$. In Part III, we use the results of the first two parts to optimize (14) and to obtain $E_{Fc}(R)$.

Part I: Let $dz > 0$ be a small constant. We define $\{z_g \mid z_g = n\,dz,\ n = 0, 1, \dots\}$ as the set of "grid values", each of which can be written as a non-negative integer multiple of $dz$. Given a normalized effective inner codeword length vector $z$, the empirical density $f_Z$ is induced as follows. We first quantize the elements of $z$, say $z_k$, to the closest grid value no larger than $z_k$, i.e., $z_g \le z_k$. Denote the quantized $z$ vector by $z^{(q)}$. For any grid value $z_g$, we define $\mathcal{I}_{z_g} = \{i \mid z_i^{(q)} = z_g\}$ as the set of indices for which the elements of the $z^{(q)}$ vector equal the particular $z_g$. Given $z$, the empirical density $f_Z$ is a discrete function defined on the grid values, with $f_Z(z_g) = \frac{|\mathcal{I}_{z_g}|}{N_o\, dz}$. According to the definition of $f_Z$, we have

$$\sum_{z_g} f_Z(z_g)\, dz = 1. \tag{15}$$

Let $z(i)$ be an $N_o$-dimensional vector with only one non-zero element corresponding to the $i$th received symbol. If the $i$th received symbol belongs to the $k$th inner code, then we let the $k$th element of $z(i)$ equal 1 and let all other elements equal 0. Since the random switch (illustrated in Figure 2) picks inner codes uniformly, we have

$$E[z(i)] = \frac{1}{N_o}\mathbf{1}, \qquad \mathrm{cov}[z(i)] = \frac{1}{N_o} I_{N_o} - \frac{1}{N_o^2}\mathbf{1}\mathbf{1}^T, \tag{16}$$

where $\mathbf{1}$ is an $N_o$-dimensional vector with all elements being one and $I_{N_o}$ is the identity matrix. According to the definitions, we have $z = \frac{1}{N_i}\sum_{i=1}^{N_i N_o} z(i)$. Since the total number of received symbols equals $N_i N_o$, we must have $\mathbf{1}^T z = N_o$. This implies that for all empirical density functions $f_Z$, we have

$$\sum_{z_g} z_g f_Z(z_g)\, dz \in [1 - dz, 1]. \tag{17}$$
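The quantization step and the two constraints (15) and (17) can be checked on a small example. The sketch below is illustrative only: the function names are hypothetical and the length vector is chosen by hand so that its elements average to one.

```python
import math

def empirical_density(z, dz):
    """Quantize each z_k down to the grid {n * dz} and return {z_g: f_Z(z_g)}."""
    counts = {}
    for z_k in z:
        # The small offset guards against floating-point round-off when
        # z_k is itself (numerically almost) a grid value.
        n = math.floor(z_k / dz + 1e-9)
        counts[n] = counts.get(n, 0) + 1
    return {n * dz: c / (len(z) * dz) for n, c in counts.items()}

def check_constraints(z, dz):
    """Return the left-hand sides of (15) and (17) for the vector z."""
    f_z = empirical_density(z, dz)
    total_mass = sum(f * dz for f in f_z.values())
    mean_length = sum(z_g * f * dz for z_g, f in f_z.items())
    return total_mass, mean_length

if __name__ == "__main__":
    z = [0.5, 1.5, 1.0, 1.0]  # hypothetical lengths with average one
    print(check_constraints(z, dz=0.2))
```

For this example the total mass is one and the quantized mean is about 0.95, which lies in $[1 - dz, 1]$ because every element is rounded down by less than $dz$.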



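Since $z$ equals the scaled summation of $N_i N_o$ independently distributed vectors $z(i)$, the characteristic function of $\sqrt{N_i}(z - \mathbf{1})$, denoted by $\varphi_z(t)$, can be written as

$$\varphi_z(t) = \left[ 1 - \frac{1}{2 N_o N_i} t^T t + O\left( \frac{1}{N_o^2 N_i^2} \right) \right]^{N_o N_i}, \tag{18}$$

where $t = [t_1, \dots, t_{N_o - 1}]^T$ is an $(N_o - 1)$-dimensional vector$^1$ with bounded elements, i.e., $\max_{1 \le k \le N_o - 1} |t_k| \le a$, for some constant $a > 0$. (18) implies that

$$\lim_{N_i, N_o \to \infty}\left\{ \varphi_z(t) - \exp\left( -\frac{1}{2} t^T t \right) \right\} = 0. \tag{19}$$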



Consequently, for large $N_i$, $N_o$, with $N_o \gg N_i$, the probability that $z$ gives a particular quantized vector $z^{(q)}$ is upper bounded by

$$\Pr\{z^{(q)}\} \le \left( dz\sqrt{N_i} \right)^{N_o - 1}\left( \frac{1}{2\pi} \right)^{\frac{N_o - 1}{2}} \exp\left( -\frac{N_i}{2}\left[ \|z^{(q)} - \mathbf{1}\|^2 - 3N_o (dz)^2 \right] \right). \tag{20}$$

The first term on the right hand side of (20) is the volume of the neighborhood of $\sqrt{N_i}(z - \mathbf{1})$ in which the quantized codeword length vector of $z$ equals $z^{(q)}$. The second term is a revised Gaussian density derived from the characteristic function $\exp\left( -\frac{1}{2} t^T t \right)$. Note that the offset $-3N_o (dz)^2$ in the second term is necessary to ensure the validity of the upper bound.

$^1$ Note that because $\mathbf{1}^T z = N_o$, $z$ has only $N_o - 1$ linearly independent elements. The characteristic function is obtained by first projecting $z$ to an $(N_o - 1)$-dimensional space. The detailed derivation is skipped.

Let $f_Z$ be the empirical density induced from a particular quantized codeword length vector $z^{(q)}$. It can be shown that the probability for $z$ to follow an empirical density $f_Z$ is upper bounded by

$$\begin{aligned} \Pr\{f_Z\} &= K(N_i, N_o)\Pr\{z^{(q)}\} \\ &\le K(N_i, N_o)\left[ dz\sqrt{\frac{N_i}{2\pi}} \right]^{N_o - 1} \exp\left( -\frac{N_i}{2}\left[ \|z^{(q)} - \mathbf{1}\|^2 - 3N_o (dz)^2 \right] \right) \\ &= K(N_i, N_o)\left[ dz\sqrt{\frac{N_i}{2\pi}} \right]^{N_o - 1} \exp\left( \frac{3}{2} N_i N_o (dz)^2 \right)\exp\left( -\frac{N_i N_o}{2}\sum_{z_g}(z_g - 1)^2 f_Z(z_g)\, dz \right), \end{aligned} \tag{21}$$

where $K(N_i, N_o)$ is a permutation term that satisfies $\lim_{N_i, N_o \to \infty}\frac{\log K(N_i, N_o)}{N_i N_o} = 0$.$^2$ From (21), we can see that for all $f_Z$ the following inequality holds,

$$\lim_{N_i, N_o \to \infty}\frac{\log \Pr\{f_Z\}}{N_i N_o} \le \frac{3}{2}(dz)^2 - \frac{1}{2}\sum_{z_g}(z_g - 1)^2 f_Z(z_g)\, dz. \tag{22}$$



Part II: Next, we derive the expression of $E_f(f_Z, R)$, which is the error exponent conditioned on an empirical inner codeword length density $f_Z$.

Let $z$ be a particular $N_o$-dimensional inner codeword length vector, which follows the density function $f_Z$. Given a finite $dz$, since the error probability conditioned on $f_Z$ can be written as $P_e(f_Z) = \exp(-N_i N_o E_f(f_Z, R))$, the error probability given $z$ can be written as

$$P_e(z) = \frac{\exp(-N_i N_o E_f(f_Z, R))}{K_1(N_i, N_o)}, \qquad \lim_{N_i, N_o \to \infty}\frac{\log K_1(N_i, N_o)}{N_i N_o} = 0, \tag{23}$$

where $K_1(N_i, N_o)$ is a permutation term. Consequently, we can obtain $E_f(f_Z, R)$ by assuming a particular inner codeword length vector $z$, whose corresponding inner codeword length density is $f_Z$.

A strict error exponent derivation should proceed by first assuming a fixed $dz$. For each grid value $z_g$, the error performance of inner codes whose effective codeword lengths belong to $[z_g, z_g + dz)$ should be bounded as a function of $R$ and $z_g$. After obtaining the overall error exponent of the concatenated fountain code based on a fixed $dz$, the asymptotic result can then be obtained by taking $dz \to 0$. The requirement that $dz \to 0$ be taken at the end is necessary since otherwise the previous derivations such as (20), (21), (23) are no longer valid. However, under the assumption that the previous derivations are valid, taking $dz \to 0$ first does not affect the validity of the rest of the proof. Therefore, to simplify the notation, from now on we will first take $dz \to 0$.

$^2$ Validity of this limit can be shown by jointly considering the two terms in (14). The detail is skipped.

By taking $dz \to 0$, (22) becomes

$$\lim_{dz \to 0,\, N_i, N_o \to \infty}\frac{\log \Pr\{f_Z\}}{N_i N_o} = -\int_0^\infty \frac{(z - 1)^2}{2} f_Z(z)\, dz. \tag{24}$$

According to the definition of $f_Z$ and the property that $\mathbf{1}^T z = N_o$, we also have

$$\int_0^\infty f_Z(z)\, dz = 1, \qquad \int_0^\infty z f_Z(z)\, dz = 1. \tag{25}$$

Consequently, error exponent (14) becomes

$$E_{Fc}(R) = \min_{f_Z,\, \int_0^\infty z f_Z(z) dz = 1}\left\{ E_f(f_Z, R) + \int_0^\infty \frac{(1 - z)^2}{2} f_Z(z)\, dz \right\}. \tag{26}$$

Assume the outer code has rate $r_o$, and is able to recover the source message from $dN_o$ outer symbol erasures and $tN_o$ outer symbol errors so long as $d + 2t \le (1 - r_o - \zeta_0)$, where $\zeta_0 > 0$ is a constant that can be made arbitrarily small. To simplify the notation, we take $\zeta_0 \to 0$ first.$^3$ Assume, for all $k$, the $k$th inner code reports an estimate of the outer symbol $\hat{\xi}_k$ together with a reliability weight $\alpha_k \in [0, 1]$. Applying Forney's GMD decoding to the outer code [12], the source message can be recovered if the following inequality holds [9, Theorem 3.1b].

$$\sum_{k=1}^{N_o} \alpha_k \mu_k > r_o N_o, \tag{27}$$

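where $\mu_k = 1$ if $\hat{\xi}_k = \xi_k$, and $\mu_k = -1$ if $\hat{\xi}_k \neq \xi_k$. Consequently, the error probability conditioned on the given $z$ vector is bounded by

$$P_e(R, r_o, z) \le \Pr\left\{ \sum_{k=1}^{N_o}\alpha_k\mu_k \le r_o N_o \right\} \le \min_{s \ge 0}\frac{E\left[ \exp\left( -sN_i\sum_{k=1}^{N_o}\alpha_k\mu_k \right) \right]}{\exp(-sN_i r_o N_o)}, \tag{28}$$

where the last inequality is due to Chernoff's bound.

Given the inner codeword lengths $z$, the random variables $\alpha_k\mu_k$ for different inner codes are independent. Therefore, (28) can be further written as

$$\begin{aligned} P_e(R, r_o, z) &\le \min_{s \ge 0}\frac{\prod_{k=1}^{N_o} E\left[ \exp(-sN_i\alpha_k\mu_k) \right]}{\exp(-sN_i r_o N_o)} \\ &= \min_{s \ge 0}\frac{\exp\left( \sum_{k=1}^{N_o}\log E\left[ \exp(-sN_i\alpha_k\mu_k) \right] \right)}{\exp(-sN_i r_o N_o)}. \end{aligned} \tag{29}$$

Now we derive the expression of $\log E\left[ \exp(-sN_i\alpha_k\mu_k) \right]$ for the $k$th inner code.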
Assume the normalized effective codeword length is z k . Given z k , depending on the received channel 
symbols, the decoder generates the maximum likelihood outer code estimate and generates a k using 

3 Taking $\zeta_0 \to 0$ requires a significant increase of $N_i$ [11]. Although the linear complexity argument requires $\zeta_0 \to 0$ to be taken after 
$N \to \infty$, switching the order of these asymptotic operations does not affect the validity of the error exponent result. 
Forney's algorithm presented in [9, Section 4.2]. Define an adjusted error exponent function $E_z(z)$ as 
follows. 

$$E_z(z) = \max_{0 \le \rho \le 1} \left[ -\rho \frac{R}{r_o} + z E_0(\rho, p_X) \right], \tag{30}$$

where $E_0(\rho, p_X)$ is as defined earlier in the paper. By following Forney's error exponent analysis presented in [9, Section 
4.2], we obtain 

$$-\log E\left[\exp(-sN_i \alpha_k \mu_k)\right] = \max\left[\min\left\{N_i E_z(z_k),\ N_i(2E_z(z_k) - s),\ N_i s\right\},\ 0\right]. \tag{31}$$



Define a function $\phi(z, s)$ as follows, 

$$\phi(z, s) = \begin{cases} -s r_o, & E_z(z) < s/2 \\ 2E_z(z) - (1 + r_o)s, & s/2 \le E_z(z) < s \\ (1 - r_o)s, & E_z(z) \ge s \end{cases} \tag{32}$$
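As a sanity check on the three cases of $\phi(z, s)$, the sketch below compares the piecewise form with the expression obtained by taking the max-min quantity on the right-hand side of (31), normalizing by $N_i$, and subtracting the threshold term $s r_o$ that comes from the denominator of (29). The function names and the test grid are illustrative assumptions, and `ez` simply stands for a given value of $E_z(z)$.

```python
def phi_piecewise(ez, s, r_o):
    """Three-case definition of phi, with ez standing for E_z(z)."""
    if ez < s / 2:
        return -s * r_o
    if ez < s:
        return 2 * ez - (1 + r_o) * s
    return (1 - r_o) * s

def phi_from_max_min(ez, s, r_o):
    """max[min{E_z, 2 E_z - s, s}, 0] minus the Chernoff threshold term s * r_o."""
    return max(min(ez, 2 * ez - s, s), 0.0) - s * r_o

# Compare the two forms on a grid of (E_z, s, r_o) values.
worst_gap = max(
    abs(phi_piecewise(i * 0.05, j * 0.05, r) - phi_from_max_min(i * 0.05, j * 0.05, r))
    for i in range(41) for j in range(41) for r in (0.1, 0.5, 0.9)
)
print(worst_gap)
```

The two forms agree at the case boundaries as well, since $\phi$ is continuous in both arguments.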

Substituting (31) into (29), we get the expression of the conditional error exponent $E_f(f_Z, R)$ as 

$$E_f(f_Z, R) = \max_{p_X,\ \frac{R}{C_F} \le r_o \le 1,\ s \ge 0} \int \phi(z, s) f_Z(z)\,dz. \tag{33}$$

Part III: Combining (33) with (26), the fountain error exponent of the concatenated code is therefore 
given by 

$$E_{Fc}(R) = \max_{p_X,\ \frac{R}{C_F} \le r_o \le 1,\ s \ge 0}\ \min_{f_Z,\ \int_0^\infty z f_Z(z)dz = 1} \int \left[ \phi(z, s) + \frac{(1 - z)^2}{2} \right] f_Z(z)\,dz. \tag{34}$$

Assume $f_Z^*$ is the inner codeword length density that minimizes $E_{Fc}(R)$ in (34). Assume we can find 
$0 < \lambda < 1$ and two density functions $f_Z^{(1)}$, $f_Z^{(2)}$ satisfying $\int_0^\infty z f_Z^{(1)}(z)dz = 1$, $\int_0^\infty z f_Z^{(2)}(z)dz = 1$, such 
that 

$$f_Z^* = \lambda f_Z^{(1)} + (1 - \lambda) f_Z^{(2)}. \tag{35}$$

It is easily seen that $E_{Fc}(R)$ should be minimized either by $f_Z^{(1)}$ or by $f_Z^{(2)}$, which contradicts the assumption 
that $f_Z^*$ is optimum. In other words, if $f_Z^*$ is indeed optimum, then a decomposition like (35) must not 
be possible. This implies that $f_Z^*$ can take non-zero values on at most two different $z$ values. Therefore, 
we can carry out the optimization in (34) only over the following class of $f_Z$ functions, characterized by 
two variables $0 \le z_0 \le 1$ and $0 \le \gamma \le 1$. 

$$f_Z(z) = \gamma\,\delta(z - z_0) + (1 - \gamma)\,\delta\left(z - \frac{1 - z_0\gamma}{1 - \gamma}\right), \tag{36}$$

where $\delta(\cdot)$ is the impulse function. 
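The two-point density can be checked directly. The following sketch, with illustrative function names, builds the two mass points for given $z_0$ and $\gamma$, confirms that the total mass and the mean are both 1, and evaluates the length penalty $\int \frac{(1-z)^2}{2} f_Z(z)dz$, which for this class reduces to $\frac{\gamma}{1-\gamma}\frac{(1-z_0)^2}{2}$.

```python
def two_point_density(z0, gamma):
    """Return the (mass, location) pairs of the two-impulse density."""
    z1 = (1 - z0 * gamma) / (1 - gamma)   # second mass point, at or above 1
    return [(gamma, z0), (1 - gamma, z1)]

def total_mass(density):
    return sum(mass for mass, _ in density)

def mean_length(density):
    return sum(mass * z for mass, z in density)

def length_penalty(density):
    """Integral of (1 - z)^2 / 2 against the density."""
    return sum(mass * (1 - z) ** 2 / 2 for mass, z in density)

def closed_form_penalty(z0, gamma):
    return gamma / (1 - gamma) * (1 - z0) ** 2 / 2

density = two_point_density(z0=0.4, gamma=0.25)
print(total_mass(density), mean_length(density),
      length_penalty(density), closed_form_penalty(0.4, 0.25))
```

The constraint $\int z f_Z(z)dz = 1$ is what fixes the location of the second impulse once $z_0$ and $\gamma$ are chosen.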
Now let us fix $p_X$, $r_o$, $\gamma$, and consider the following optimization of $E_{Fc}(R, p_X, r_o, \gamma)$. 

$$E_{Fc}(R, p_X, r_o, \gamma) = \min_{0 \le z_0 \le 1}\ \max_{s \ge 0}\ \gamma\,\phi(z_0, s) + (1 - \gamma)\,\phi\left(\frac{1 - z_0\gamma}{1 - \gamma}, s\right) + \frac{\gamma}{1 - \gamma}\frac{(1 - z_0)^2}{2}. \tag{37}$$



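Since, given $z_0$, $\gamma\,\phi(z_0, s) + (1 - \gamma)\,\phi\left(\frac{1 - z_0\gamma}{1 - \gamma}, s\right)$ is a piecewise linear function of $s$, depending on the value of $\gamma$, the 
optimum $s^*$ that maximizes (37) should either satisfy $s^* = E_z(z_0)$ or $s^* = E_z\left(\frac{1 - z_0\gamma}{1 - \gamma}\right)$. 

When $\gamma \ge \frac{1 - r_o}{2}$, we have $s^* = E_z(z_0)$. This yields 

$$E_{Fc}(R, p_X, r_o) = \min_{0 \le z_0, \gamma \le 1}\ \frac{\gamma}{1 - \gamma}\frac{(1 - z_0)^2}{2} + (1 - r_o)E_z(z_0). \tag{38}$$

When $\gamma < \frac{1 - r_o}{2}$, we have $s^* = E_z\left(\frac{1 - z_0\gamma}{1 - \gamma}\right)$, and hence 

$$\begin{aligned} E_{Fc}(R, p_X, r_o) &= \min_{0 \le z_0, \gamma \le 1}\ \gamma\,\phi(z_0, s^*) + \frac{\gamma}{1 - \gamma}\frac{(1 - z_0)^2}{2} + (1 - \gamma)(1 - r_o)E_z\left(\frac{1 - \gamma z_0}{1 - \gamma}\right) \\ &\ge \min_{0 \le z_0, \gamma \le 1}\ 2\gamma E_z(z_0) + \frac{\gamma}{1 - \gamma}\frac{(1 - z_0)^2}{2} + (1 - r_o - 2\gamma)E_z\left(\frac{1 - \gamma z_0}{1 - \gamma}\right). \end{aligned} \tag{39}$$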

By substituting $E_z(z) = \max_{0 \le \rho \le 1}\left[-\rho\frac{R}{r_o} + zE_0(\rho, p_X)\right]$ into (39), we have 

$$E_{Fc}(R, p_X, r_o) \ge \min_{0 \le z_0, \gamma \le 1}\ \max_{0 \le \rho \le 1}\ (1 - r_o)\left(-\rho\frac{R}{r_o} + E_0(\rho, p_X)\right) - \frac{\gamma}{1 - \gamma}\left[(1 + r_o)(1 - z_0)E_0(\rho, p_X) - \frac{(1 - z_0)^2}{2}\right]. \tag{40}$$

Note that if $(1 + r_o)(1 - z_0)E_0(\rho, p_X) - \frac{(1 - z_0)^2}{2} < 0$, we have $E_{Fc}(R, p_X, r_o) > (1 - r_o)\left(-\rho\frac{R}{r_o} + E_0(\rho, p_X)\right)$, 
which is Forney's exponent given $p_X$, $r_o$. This contradicts the fact that Forney's exponent is the 
maximum achievable exponent for one-level concatenated codes in a classical system [9]. Therefore, we 
must have $(1 + r_o)(1 - z_0)E_0(\rho, p_X) - \frac{(1 - z_0)^2}{2} \ge 0$. Consequently, both (38) and (40) are minimized at 

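$\gamma^* = \frac{1 - r_o}{2}$, and 

$$\begin{aligned} E_{Fc}(R) &= \max_{\frac{R}{C_F} \le r_o \le 1,\ p_X}\ \min_{0 \le z_0 \le 1} \left\{ (1 - r_o)E_z(z_0) + \frac{1 - r_o}{1 + r_o}\frac{(1 - z_0)^2}{2} \right\} \\ &= \max_{\frac{R}{C_F} \le r_o \le 1,\ p_X,\ 0 \le \rho \le 1}\ \min_{0 \le z_0 \le 1} \left\{ (1 - r_o)\left(-\rho\frac{R}{r_o} + E_0(\rho, p_X)\right) + \frac{1 - r_o}{1 + r_o}\frac{(1 - z_0)}{2}\left[(1 - z_0) - 2(1 + r_o)E_0(\rho, p_X)\right] \right\}. \end{aligned} \tag{41}$$

The last step is to optimize (41) over $z_0$. Note that if $(1 + r_o)E_0(\rho, p_X) > 1$, for a fixed $r_o$, we have 

$$E_{Fc}(R, r_o) \le \max_{p_X,\ 0 \le \rho \le 1} \left\{ -\rho\frac{R}{r_o}(1 - r_o) + \frac{1 - r_o}{1 + r_o}\cdot\frac{1}{2} \right\}, \tag{42}$$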



which implies $\rho = 0$. But $\rho = 0$ implies $(1 + r_o)E_0(\rho, p_X) = 0 < 1$, which contradicts the assumption $(1 + r_o)E_0(\rho, p_X) > 1$. 
Therefore, we can assume $(1 + r_o)E_0(\rho, p_X) \le 1$. Consequently, substituting $z_0^* = 1 - (1 + r_o)E_0(\rho, p_X)$ into (41) 
gives the desired result (5). 
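The final substitution can be verified numerically for a fixed operating point. The sketch below treats the value of $E_0(\rho, p_X)$ as a plain number `e0` (an illustrative assumption, as are the other parameter values), minimizes the bracketed expression of (41) over a grid of $z_0 \in [0, 1]$, and compares the result with the closed form $(1 - r_o)\left(-\rho\frac{R}{r_o} + E_0\left[1 - \frac{1 + r_o}{2}E_0\right]\right)$ obtained at $z_0^* = 1 - (1 + r_o)E_0$.

```python
def objective(z0, rho, rate, r_o, e0):
    """Expression inside the min over z0, for fixed rho, r_o and E_0 value e0."""
    forney_term = (1 - r_o) * (-rho * rate / r_o + e0)
    length_term = ((1 - r_o) / (1 + r_o) * (1 - z0) / 2
                   * ((1 - z0) - 2 * (1 + r_o) * e0))
    return forney_term + length_term

def closed_form(rho, rate, r_o, e0):
    """Value after substituting z0* = 1 - (1 + r_o) * e0."""
    return (1 - r_o) * (-rho * rate / r_o + e0 * (1 - (1 + r_o) / 2 * e0))

def grid_minimum(rho, rate, r_o, e0, steps=10000):
    """Brute-force minimum of the objective over z0 in [0, 1]."""
    return min(objective(i / steps, rho, rate, r_o, e0) for i in range(steps + 1))

# Hypothetical operating point with (1 + r_o) * e0 = 0.48 <= 1.
params = dict(rho=0.5, rate=0.2, r_o=0.6, e0=0.3)
print(grid_minimum(**params), closed_form(**params))
```

The objective is a convex quadratic in $1 - z_0$, so the interior stationary point is the minimizer whenever $(1 + r_o)E_0 \le 1$.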
To achieve linear coding complexity, we fix $N_i$ at a large number and only take $N_o$ to infinity. According 
to [11], it is easy to see that the encoding complexity is linear in the number of transmitted symbols. 
At the receiver, we keep at most $2N_i$ symbols for each inner code and drop the extra received symbols. 
Consequently, the normalized effective codeword length of any inner code is upper-bounded by 2. Because 
(38) and (40) are both minimized at $\gamma^* = \frac{1 - r_o}{2}$, according to (36), the empirical density function $f_Z(z)$ that 
minimizes the error exponent takes the form $f_Z(z) = \frac{1 - r_o}{2}\delta(z - z_0) + \frac{1 + r_o}{2}\delta\left(z - \frac{2 - z_0(1 - r_o)}{1 + r_o}\right)$. The second 
term implies $z = \frac{2 - z_0(1 - r_o)}{1 + r_o} \le 2$. Therefore, upper bounding the effective codeword length by 2 does 
not change the error exponent result. At the same time, with $z_k \le 2$, $\forall k$, the decoding complexity of any inner 
code is upper-bounded by a constant $O(\exp(2N_i))$. According to [12], the overall decoding complexity 
of the concatenated code is therefore linear in $N_o$, and hence is linear in $N$. Since fixing $N_i$ causes a 
reduction of $\zeta_1 > 0$ in the achievable error exponent, and both $\zeta_0$, $\zeta_1$ can be made arbitrarily small as we 
increase $N_i$, we conclude that the fountain error exponent $E_{Fc}(R)$ given in (5) can be arbitrarily approached 
by one-level concatenated fountain codes with a linear coding complexity. ■ 

B. Proof of Corollary 1 

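Proof: Because $0 \le r_o \le 1$, it is easy to see $\tilde{E}_{Fc}(R) \le E_{Fc}(R) \le E_c(R)$. We will next prove 

$$\lim_{R \to C_F} \frac{\tilde{E}_{Fc}(R)}{E_{Fc}(R)} = 1.$$

Define 

$$g(p_X, r_o, \rho) = (1 - r_o)\left(-\rho\frac{R}{r_o} + E_0(\rho, p_X)\left[1 - \frac{1 + r_o}{2}E_0(\rho, p_X)\right]\right), \tag{43}$$

such that 

$$E_{Fc}(R) = \max_{p_X,\ \frac{R}{C_F} \le r_o \le 1,\ 0 \le \rho \le 1} g(p_X, r_o, \rho). \tag{44}$$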



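Using Taylor's expansion to expand $g(p_X, r_o, \rho)$ at $r_o = 1$ and $\rho = 0$, we get 

$$g(p_X, r_o, \rho) = \sum_{i,j} \frac{1}{i!\,j!}\,\beta(i, j)\,(r_o - 1)^i \rho^j, \tag{45}$$

where $\beta(i, j) = \left.\frac{\partial^{(i+j)} g(p_X, r_o, \rho)}{\partial r_o^i\,\partial \rho^j}\right|_{r_o = 1, \rho = 0}$, with $i$ and $j$ being nonnegative integers. It can be verified that 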



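$$\begin{aligned} \beta(1, 0) &= \left.\left\{\rho\frac{R}{r_o^2} - E_0(\rho, p_X) + r_o E_0^2(\rho, p_X)\right\}\right|_{r_o = 1, \rho = 0} = 0, \\ \beta(2, 0) &= \left.\left\{-2\rho\frac{R}{r_o^3} + E_0^2(\rho, p_X)\right\}\right|_{r_o = 1, \rho = 0} = 0, \\ \beta(i, 0) &= \left.\frac{i!\,(-1)^{i-1}\rho R}{r_o^{i+1}}\right|_{r_o = 1, \rho = 0} = 0, \quad \forall i \ge 3. \end{aligned} \tag{46}$$

It can also be verified that 

$$\begin{aligned} \beta(0, 1) &= \left.(1 - r_o)\left(-\frac{R}{r_o} + \frac{\partial E_0(\rho, p_X)}{\partial \rho} - \frac{\partial E_0(\rho, p_X)}{\partial \rho}E_0(\rho, p_X)(1 + r_o)\right)\right|_{r_o = 1, \rho = 0} = 0, \\ \beta(0, j) &= \left.(1 - r_o)\,h_j(R, \rho, r_o)\right|_{r_o = 1, \rho = 0} = 0, \quad \forall j \ge 2, \end{aligned} \tag{47}$$

where $h_j(R, \rho, r_o)$ is a function of $R$, $\rho$, $r_o$. We also have 

$$\begin{aligned} \beta(1, 1) &= \left.\left\{\frac{R}{r_o^2} - \frac{\partial E_0(\rho, p_X)}{\partial \rho} + 2r_o E_0(\rho, p_X)\frac{\partial E_0(\rho, p_X)}{\partial \rho}\right\}\right|_{r_o = 1, \rho = 0} = R - \left.\frac{\partial E_0(\rho, p_X)}{\partial \rho}\right|_{\rho = 0}, \\ \beta(2, 1) &= -2R \ne 0, \\ \beta(1, 2) &= \left.\left\{-\frac{\partial^2 E_0(\rho, p_X)}{\partial \rho^2} + 2\left(\frac{\partial E_0(\rho, p_X)}{\partial \rho}\right)^2\right\}\right|_{\rho = 0} \ne 0. \end{aligned} \tag{48}$$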



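Similarly, define 

$$\tilde{g}(p_X, r_o, \rho) = (1 - r_o)\left(-\rho\frac{R}{r_o} + E_0(\rho, p_X)\left[1 - E_0(\rho, p_X)\right]\right), \tag{49}$$

such that 

$$\tilde{E}_{Fc}(R) = \max_{p_X,\ \frac{R}{C_F} \le r_o \le 1,\ 0 \le \rho \le 1} \tilde{g}(p_X, r_o, \rho). \tag{50}$$

Using Taylor's expansion to expand $\tilde{g}(p_X, r_o, \rho)$ at $r_o = 1$ and $\rho = 0$, we get 

$$\tilde{g}(p_X, r_o, \rho) = \sum_{i,j} \frac{1}{i!\,j!}\,\tilde{\beta}(i, j)\,(r_o - 1)^i \rho^j, \tag{51}$$

where $\tilde{\beta}(i, j) = \left.\frac{\partial^{(i+j)} \tilde{g}(p_X, r_o, \rho)}{\partial r_o^i\,\partial \rho^j}\right|_{r_o = 1, \rho = 0}$. It can be verified that 

$$\begin{aligned} \tilde{\beta}(1, 0) &= \left.\left\{\rho\frac{R}{r_o^2} - E_0(\rho, p_X) + E_0^2(\rho, p_X)\right\}\right|_{r_o = 1, \rho = 0} = 0, \\ \tilde{\beta}(i, 0) &= \left.\frac{i!\,(-1)^{i-1}\rho R}{r_o^{i+1}}\right|_{r_o = 1, \rho = 0} = 0, \quad \forall i \ge 2. \end{aligned} \tag{52}$$

It can also be verified that 

$$\begin{aligned} \tilde{\beta}(0, 1) &= \left.(1 - r_o)\left(-\frac{R}{r_o} + \frac{\partial E_0(\rho, p_X)}{\partial \rho} - 2E_0(\rho, p_X)\frac{\partial E_0(\rho, p_X)}{\partial \rho}\right)\right|_{r_o = 1, \rho = 0} = 0, \\ \tilde{\beta}(0, j) &= \left.(1 - r_o)\,\tilde{h}_j(R, \rho, r_o)\right|_{r_o = 1, \rho = 0} = 0, \quad \forall j \ge 2, \end{aligned} \tag{53}$$

where $\tilde{h}_j(R, \rho, r_o)$ is a function of $R$, $\rho$, $r_o$. We also have 

$$\begin{aligned} \tilde{\beta}(1, 1) &= \left.\left\{\frac{R}{r_o^2} - \frac{\partial E_0(\rho, p_X)}{\partial \rho} + 2E_0(\rho, p_X)\frac{\partial E_0(\rho, p_X)}{\partial \rho}\right\}\right|_{r_o = 1, \rho = 0} = R - \left.\frac{\partial E_0(\rho, p_X)}{\partial \rho}\right|_{\rho = 0}, \\ \tilde{\beta}(2, 1) &= -2R \ne 0, \\ \tilde{\beta}(1, 2) &= \left.\left\{-\frac{\partial^2 E_0(\rho, p_X)}{\partial \rho^2} + 2\left(\frac{\partial E_0(\rho, p_X)}{\partial \rho}\right)^2\right\}\right|_{\rho = 0} \ne 0. \end{aligned} \tag{54}$$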
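Because $\tilde{\beta}(1, 1) = \beta(1, 1)$, $\tilde{\beta}(2, 1) = \beta(2, 1) \ne 0$, $\tilde{\beta}(1, 2) = \beta(1, 2) \ne 0$, by L'Hospital's rule, we have 

$$\lim_{R \to C_F} \frac{\tilde{E}_{Fc}(R)}{E_{Fc}(R)} = \lim_{R \to C_F,\ r_o \to 1,\ \rho \to 0} \frac{\tilde{\beta}(1, 1)(r_o - 1)\rho + \frac{1}{2}\tilde{\beta}(2, 1)(r_o - 1)^2\rho + \frac{1}{2}\tilde{\beta}(1, 2)(r_o - 1)\rho^2}{\beta(1, 1)(r_o - 1)\rho + \frac{1}{2}\beta(2, 1)(r_o - 1)^2\rho + \frac{1}{2}\beta(1, 2)(r_o - 1)\rho^2} = 1. \tag{55}$$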
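The agreement of the leading mixed Taylor coefficients can be spot-checked numerically. The sketch below is an illustration under stated assumptions: it picks a binary symmetric channel with crossover probability 0.1 and uniform input, for which Gallager's function has the closed form used in `e0_bsc`, fixes a hypothetical rate $R = 0.2$ nats, and estimates $\partial^2/\partial r_o\,\partial\rho$ at $(r_o, \rho) = (1, 0)$ by central differences for both the exponent expression with the factor $\frac{1 + r_o}{2}$ and its lower-bound counterpart without it. Both estimates should be close to $R - \left.\partial E_0/\partial\rho\right|_{\rho = 0}$, and the derivative of $E_0$ at $\rho = 0$ is the mutual information.

```python
import math

def e0_bsc(rho, p=0.1):
    """Gallager's E_0 in nats for a BSC(p) with uniform input (assumed example channel)."""
    a = 1.0 / (1.0 + rho)
    return rho * math.log(2) - (1 + rho) * math.log(p ** a + (1 - p) ** a)

def g(r_o, rho, rate):
    """Exponent expression with the (1 + r_o) / 2 factor."""
    e0 = e0_bsc(rho)
    return (1 - r_o) * (-rho * rate / r_o + e0 * (1 - (1 + r_o) / 2 * e0))

def g_tilde(r_o, rho, rate):
    """Lower-bound expression with the factor replaced by 1."""
    e0 = e0_bsc(rho)
    return (1 - r_o) * (-rho * rate / r_o + e0 * (1 - e0))

def mixed_partial(f, rate, h=1e-4):
    """Central-difference estimate of d^2 f / (d r_o d rho) at r_o = 1, rho = 0."""
    return (f(1 + h, h, rate) - f(1 + h, -h, rate)
            - f(1 - h, h, rate) + f(1 - h, -h, rate)) / (4 * h * h)

def mutual_information_bsc(p=0.1):
    """I(p_X) in nats for a BSC(p) with uniform input."""
    return math.log(2) + p * math.log(p) + (1 - p) * math.log(1 - p)

rate = 0.2
print(mixed_partial(g, rate), mixed_partial(g_tilde, rate),
      rate - mutual_information_bsc())
```

The finite-difference step and tolerance are arbitrary choices; a symbolic algebra package would give the same coefficients exactly.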



C. Proof of Theorem 4 
Proof: Define 



gin, r o, p) = (i - r ) ( p/(px) (1 - ^p) 



E Fc {j,Px, r Q ) = max g(j, r Q , p), 

0<p<l 

E Fc (-f,p x ) = max E Fc (-f,p x ,r ). 

0<r o <l 

We will first prove that 

lim iW7,^) =L 

E Fc ^,p x ) 

Note that g(j,r , p) is maximized at p = p*, with 

Ilpx) (l - i) 



p^_ ( d 2 E (p, Px ) 
2 



dp 2 



2I 2 ( P x] 



p=0 



(56) 



(57) 



P 



d 2 E (p,p x ) 
dp 2 



p=0 



2P(p x ) 



(58) 



where we have assumed $0 \le \rho^* \le 1$. This assumption is valid when $r_o$ is also optimized. Consequently, 

Efc{1iPx, r o) is maximized at r Q = r*, with 



r* = argmax(l — r Q ) f 1 — 



Therefore, 



0<r o <l 



lim — > lim 

7-1 E Fc {j,p X ) 7"~ *1 



7 



a/7 2 + 87 - 7 



E Fcs (ry,p x ,p) 



g(l,Px,P,r ) 



1. 



p=p*,r =rj_ 

Following a similar idea as the proof of Corollary 1, it can be shown that 



lim 14*1*4 = 1. 
7-1 E Fc {>y,px) 



Combining (60) and (61), we get 



lim f^IlM = l im El^hlA Um |4^4 > ! . 
7-1 E Fc (-f,px) 7-1 E Fc (-f,p x ) E Fc (7,p x ) 

Because $E_{Fcs}(\gamma, p_X) \le E_{Fc}(\gamma, p_X)$, (62) implies $\lim_{\gamma \to 1} \frac{E_{Fcs}(\gamma, p_X)}{E_{Fc}(\gamma, p_X)} = 1$. 



(59) 



(60) 



(61) 



(62) 